A Comparison of Neural Networks for Real-Time Emotion Recognition from Speech Signals
Authors
Abstract
Speech and emotion recognition improve the quality of human-computer interaction and enable easier-to-use interfaces for users of every skill level in software applications. In this study, we developed two neural networks, the emotion recognition neural network (ERNN) and the Gram-Charlier emotion recognition neural network (GERNN), to classify voice signals for emotion recognition. The ERNN has 128 input nodes, 20 hidden neurons, and three summing output nodes. A set of 97,920 training samples is used to train the ERNN, and a separate set of 24,480 test samples is used to evaluate its performance. The voice samples are acquired from the movies “Anger Management” and “Pick of Destiny”. The ERNN achieves an average recognition performance of 100%. This high level of recognition suggests that the ERNN is a promising method for emotion recognition in computer applications. Furthermore, the GERNN has four input nodes, 20 hidden neurons, and three output nodes, and achieves an average recognition performance of 33%, which indicates that Gram-Charlier coefficients cannot be used to discriminate emotion signals. In addition, Hinton diagrams are utilized to display the optimality of the ERNN weights.
Key-Words: Back-propagation learning algorithm, Neural network, Emotion, Speech, Power spectrum, Fast Fourier Transform (FFT), Bayes optimal decision rule.
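To make the described architecture concrete, the sketch below shows a 128-20-3 feed-forward network trained with back-propagation on power-spectrum features, matching the layer sizes and feature type stated in the abstract. The frame length (256 samples), sigmoid activations, learning rate, normalization, and weight initialization are illustrative assumptions, not details taken from the paper.

```python
import numpy as np

# Feature extraction (assumed details): 128 power-spectrum values per speech frame via the FFT.
def power_spectrum_features(frame, n_features=128):
    spectrum = np.abs(np.fft.rfft(frame, n=2 * n_features)) ** 2
    feats = spectrum[:n_features]
    return feats / (np.max(feats) + 1e-12)  # scale to [0, 1]

# ERNN sketch: 128 inputs, 20 hidden neurons, 3 outputs, trained with back-propagation.
class ERNN:
    def __init__(self, n_in=128, n_hidden=20, n_out=3, lr=0.1, seed=0):
        rng = np.random.default_rng(seed)
        self.W1 = rng.normal(0, 0.1, (n_in, n_hidden))
        self.b1 = np.zeros(n_hidden)
        self.W2 = rng.normal(0, 0.1, (n_hidden, n_out))
        self.b2 = np.zeros(n_out)
        self.lr = lr

    @staticmethod
    def _sigmoid(x):
        return 1.0 / (1.0 + np.exp(-x))

    def forward(self, x):
        self.h = self._sigmoid(x @ self.W1 + self.b1)
        self.y = self._sigmoid(self.h @ self.W2 + self.b2)
        return self.y

    def backward(self, x, target):
        # One back-propagation step for a squared-error loss.
        y = self.forward(x)
        delta_out = (y - target) * y * (1 - y)
        delta_hid = (delta_out @ self.W2.T) * self.h * (1 - self.h)
        self.W2 -= self.lr * np.outer(self.h, delta_out)
        self.b2 -= self.lr * delta_out
        self.W1 -= self.lr * np.outer(x, delta_hid)
        self.b1 -= self.lr * delta_hid
        return 0.5 * np.sum((y - target) ** 2)

# Usage with a synthetic frame (random data, shapes only); the target encodes one of three emotion classes.
frame = np.random.randn(256)
x = power_spectrum_features(frame)
net = ERNN()
loss = net.backward(x, np.array([1.0, 0.0, 0.0]))
print(net.forward(x), loss)
```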
Similar resources
Speech Emotion Recognition Using Scalogram Based Deep Structure
Speech Emotion Recognition (SER) is an important part of speech-based Human-Computer Interface (HCI) applications. Previous SER methods rely on the extraction of features and training an appropriate classifier. However, most of those features can be affected by emotionally irrelevant factors such as gender, speaking styles and environment. Here, an SER method has been proposed based on a concat...
A Comparative Study of Gender and Age Classification in Speech Signals
Accurate gender classification is useful in speech and speaker recognition as well as speech emotion classification, because a better performance has been reported when separate acoustic models are employed for males and females. Gender classification is also apparent in face recognition, video summarization, human-robot interaction, etc. Although gender classification is rather mature in a...
A hybrid EEG-based emotion recognition approach using Wavelet Convolutional Neural Networks (WCNN) and support vector machine
Nowadays, deep learning and convolutional neural networks (CNNs) have become widespread tools in many biomedical engineering studies. A CNN is an end-to-end tool that integrates the processing procedure, but in some situations it needs to be fused with machine learning methods to be more accurate. In this paper, a hybrid approach based on deep features extracted from Wave...
Convolutional neural network with adaptable windows for speech recognition
Although, speech recognition systems are widely used and their accuracies are continuously increased, there is a considerable performance gap between their accuracies and human recognition ability. This is partially due to high speaker variations in speech signal. Deep neural networks are among the best tools for acoustic modeling. Recently, using hybrid deep neural network and hidden Markov mo...
Classification of ECG signals using Hermite functions and MLP neural networks
Classification of heart arrhythmia is an important step in developing devices for monitoring the health of individuals. This paper proposes a three-module system for classification of electrocardiogram (ECG) beats. These modules are: a denoising module, a feature extraction module, and a classification module. In the first module, the stationary wavelet transform (SWT) is used for noise reduction of ...
Publication date: 2009